<!DOCTYPE html>
<html lang="en-US">
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <title>《统计学习方法》第 6 章“最大熵模型”学习笔记 | 算法不好玩</title>
    <meta name="generator" content="VuePress 1.8.2">
    <link rel="icon" href="/book/logo.png">
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.7.1/katex.min.css">
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/github-markdown-css/2.10.0/github-markdown.min.css">
    <meta name="description" content="">
    
    <link rel="preload" href="/book/assets/css/0.styles.e159a327.css" as="style"><link rel="preload" href="/book/assets/js/app.a4355b0d.js" as="script"><link rel="preload" href="/book/assets/js/2.ba785ee6.js" as="script"><link rel="preload" href="/book/assets/js/47.e7767237.js" as="script"><link rel="prefetch" href="/book/assets/js/10.b6950be1.js"><link rel="prefetch" href="/book/assets/js/11.f4ee7d6e.js"><link rel="prefetch" href="/book/assets/js/12.50fe1ace.js"><link rel="prefetch" href="/book/assets/js/13.42356fcc.js"><link rel="prefetch" href="/book/assets/js/14.e740c742.js"><link rel="prefetch" href="/book/assets/js/15.05c628f7.js"><link rel="prefetch" href="/book/assets/js/16.bd9bc9d5.js"><link rel="prefetch" href="/book/assets/js/17.e120b4fc.js"><link rel="prefetch" href="/book/assets/js/18.398213f8.js"><link rel="prefetch" href="/book/assets/js/19.e29ad0b2.js"><link rel="prefetch" href="/book/assets/js/20.14e30c1c.js"><link rel="prefetch" href="/book/assets/js/21.4f05f6c9.js"><link rel="prefetch" href="/book/assets/js/22.98fbf199.js"><link rel="prefetch" href="/book/assets/js/23.285387f4.js"><link rel="prefetch" href="/book/assets/js/24.852addbe.js"><link rel="prefetch" href="/book/assets/js/25.d30ade13.js"><link rel="prefetch" href="/book/assets/js/26.23dfa040.js"><link rel="prefetch" href="/book/assets/js/27.b6eae7df.js"><link rel="prefetch" href="/book/assets/js/28.97878b08.js"><link rel="prefetch" href="/book/assets/js/29.7217a3d0.js"><link rel="prefetch" href="/book/assets/js/3.f0c15194.js"><link rel="prefetch" href="/book/assets/js/30.0ced5a5a.js"><link rel="prefetch" href="/book/assets/js/31.8f432033.js"><link rel="prefetch" href="/book/assets/js/32.aac9aa62.js"><link rel="prefetch" href="/book/assets/js/33.db835477.js"><link rel="prefetch" href="/book/assets/js/34.a1a59c2a.js"><link rel="prefetch" href="/book/assets/js/35.ed2ef96b.js"><link rel="prefetch" href="/book/assets/js/36.48099c55.js"><link rel="prefetch" href="/book/assets/js/37.32827612.js"><link rel="prefetch" href="/book/assets/js/38.58abec0e.js"><link rel="prefetch" href="/book/assets/js/39.f6eb3874.js"><link rel="prefetch" href="/book/assets/js/4.13ea3b32.js"><link rel="prefetch" href="/book/assets/js/40.80523ed8.js"><link rel="prefetch" href="/book/assets/js/41.f7b72b7a.js"><link rel="prefetch" href="/book/assets/js/42.44e40de9.js"><link rel="prefetch" href="/book/assets/js/43.065f077f.js"><link rel="prefetch" href="/book/assets/js/44.386d9aea.js"><link rel="prefetch" href="/book/assets/js/45.d8a88f16.js"><link rel="prefetch" href="/book/assets/js/46.33bc2083.js"><link rel="prefetch" href="/book/assets/js/48.9bb76138.js"><link rel="prefetch" href="/book/assets/js/49.c2ca5c29.js"><link rel="prefetch" href="/book/assets/js/5.32eda204.js"><link rel="prefetch" href="/book/assets/js/50.1242fe24.js"><link rel="prefetch" href="/book/assets/js/51.e8f8117d.js"><link rel="prefetch" href="/book/assets/js/52.eae5648f.js"><link rel="prefetch" href="/book/assets/js/53.39876d83.js"><link rel="prefetch" href="/book/assets/js/6.1c662163.js"><link rel="prefetch" href="/book/assets/js/7.29470445.js"><link rel="prefetch" href="/book/assets/js/8.814b737f.js"><link rel="prefetch" href="/book/assets/js/9.d4211262.js">
    <link rel="stylesheet" href="/book/assets/css/0.styles.e159a327.css">
  </head>
  <body>
    <div id="app" data-server-rendered="true"><div class="theme-container no-sidebar"><header class="navbar"><div class="sidebar-button"><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" role="img" viewBox="0 0 448 512" class="icon"><path fill="currentColor" d="M436 124H12c-6.627 0-12-5.373-12-12V80c0-6.627 5.373-12 12-12h424c6.627 0 12 5.373 12 12v32c0 6.627-5.373 12-12 12zm0 160H12c-6.627 0-12-5.373-12-12v-32c0-6.627 5.373-12 12-12h424c6.627 0 12 5.373 12 12v32c0 6.627-5.373 12-12 12zm0 160H12c-6.627 0-12-5.373-12-12v-32c0-6.627 5.373-12 12-12h424c6.627 0 12 5.373 12 12v32c0 6.627-5.373 12-12 12z"></path></svg></div> <a href="/book/" class="home-link router-link-active"><!----> <span class="site-name">算法不好玩</span></a> <div class="links"><div class="search-box"><input aria-label="Search" autocomplete="off" spellcheck="false" value=""> <!----></div> <nav class="nav-links can-hide"><div class="nav-item"><a href="/book/" class="nav-link">
  主页
</a></div><div class="nav-item"><a href="/book/guide/" class="nav-link">
  关于我
</a></div><div class="nav-item"><div class="dropdown-wrapper"><button type="button" aria-label="基础算法与数据结构" class="dropdown-title"><span class="title">基础算法与数据结构</span> <span class="arrow down"></span></button> <button type="button" aria-label="基础算法与数据结构" class="mobile-dropdown-title"><span class="title">基础算法与数据结构</span> <span class="arrow right"></span></button> <ul class="nav-dropdown" style="display:none;"><li class="dropdown-item"><h4>
          排序算法
        </h4> <ul class="dropdown-subitem-wrapper"><li class="dropdown-subitem"><a href="/book/algs/01-binary-search/" class="nav-link">
  第 1 章 二分查找
</a></li><li class="dropdown-subitem"><a href="/book/algs/recursion/" class="nav-link">
  第 2 章 递归
</a></li></ul></li><li class="dropdown-item"><h4>
          数组里的算法
        </h4> <ul class="dropdown-subitem-wrapper"><li class="dropdown-subitem"><a href="/book/algs/01-binary-search/" class="nav-link">
  第 4 章 滑动窗口
</a></li><li class="dropdown-subitem"><a href="/book/algs/02-basic-sorting/" class="nav-link">
  第 5 章 双指针
</a></li></ul></li></ul></div></div><div class="nav-item"><div class="dropdown-wrapper"><button type="button" aria-label="威威道来" class="dropdown-title"><span class="title">威威道来</span> <span class="arrow down"></span></button> <button type="button" aria-label="威威道来" class="mobile-dropdown-title"><span class="title">威威道来</span> <span class="arrow right"></span></button> <ul class="nav-dropdown" style="display:none;"><li class="dropdown-item"><!----> <a href="/book/talk-show/2021-04/" class="nav-link">
  2021 年 4 月
</a></li></ul></div></div><div class="nav-item"><a href="https://leetcode-cn.com/" target="_blank" rel="noopener noreferrer" class="nav-link external">
  力扣
  <span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></div><div class="nav-item"><a href="https://gitee.com/liweiwei1419/book/pages" target="_blank" rel="noopener noreferrer" class="nav-link external">
  Gitee 部署页面
  <span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></div> <!----></nav></div></header> <div class="sidebar-mask"></div> <aside class="sidebar"><nav class="nav-links"><div class="nav-item"><a href="/book/" class="nav-link">
  主页
</a></div><div class="nav-item"><a href="/book/guide/" class="nav-link">
  关于我
</a></div><div class="nav-item"><div class="dropdown-wrapper"><button type="button" aria-label="基础算法与数据结构" class="dropdown-title"><span class="title">基础算法与数据结构</span> <span class="arrow down"></span></button> <button type="button" aria-label="基础算法与数据结构" class="mobile-dropdown-title"><span class="title">基础算法与数据结构</span> <span class="arrow right"></span></button> <ul class="nav-dropdown" style="display:none;"><li class="dropdown-item"><h4>
          排序算法
        </h4> <ul class="dropdown-subitem-wrapper"><li class="dropdown-subitem"><a href="/book/algs/01-binary-search/" class="nav-link">
  第 1 章 二分查找
</a></li><li class="dropdown-subitem"><a href="/book/algs/recursion/" class="nav-link">
  第 2 章 递归
</a></li></ul></li><li class="dropdown-item"><h4>
          数组里的算法
        </h4> <ul class="dropdown-subitem-wrapper"><li class="dropdown-subitem"><a href="/book/algs/01-binary-search/" class="nav-link">
  第 4 章 滑动窗口
</a></li><li class="dropdown-subitem"><a href="/book/algs/02-basic-sorting/" class="nav-link">
  第 5 章 双指针
</a></li></ul></li></ul></div></div><div class="nav-item"><div class="dropdown-wrapper"><button type="button" aria-label="威威道来" class="dropdown-title"><span class="title">威威道来</span> <span class="arrow down"></span></button> <button type="button" aria-label="威威道来" class="mobile-dropdown-title"><span class="title">威威道来</span> <span class="arrow right"></span></button> <ul class="nav-dropdown" style="display:none;"><li class="dropdown-item"><!----> <a href="/book/talk-show/2021-04/" class="nav-link">
  2021 年 4 月
</a></li></ul></div></div><div class="nav-item"><a href="https://leetcode-cn.com/" target="_blank" rel="noopener noreferrer" class="nav-link external">
  力扣
  <span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></div><div class="nav-item"><a href="https://gitee.com/liweiwei1419/book/pages" target="_blank" rel="noopener noreferrer" class="nav-link external">
  Gitee 部署页面
  <span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></div> <!----></nav>  <!----> </aside> <main class="page"> <div class="theme-default-content content__default"><div class="custom-block tip"><p class="custom-block-title">TIP</p> <p>本文介绍了最大熵模型。</p></div> <h3 id="用于分类问题"><a href="#用于分类问题" class="header-anchor">#</a> 用于分类问题</h3> <p>最大熵模型用于分类问题，其基本假设是：在满足约束条件的情况下，条件熵最大的模型就是最好的模型。正如一个著名的投资理念“鸡蛋不要放在一个篮子里”。</p> <p>目标函数是条件熵，我们希望它最大，即条件概率的期望最大。</p> <p>已知信息成为目标函数的约束，是客观存在的，必须得到满足。其它不作任何假设，越随机越合理。有多少个特征函数，就有多少个约束，再加上条件概率之和为 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mn class="mjx-n"><mjx-c c="1"></mjx-c></mjx-mn></mjx-math></mjx-container>。</p> <h3 id="认识条件熵"><a href="#认识条件熵" class="header-anchor">#</a> 认识条件熵</h3> <p>条件熵是熵的期望，它度量了我们的 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="Y"></mjx-c></mjx-mi></mjx-math></mjx-container> 在知道了 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="X"></mjx-c></mjx-mi></mjx-math></mjx-container> 以后的不确定性，条件熵定义的最原始形式是：</p> <p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math><semantics><mrow><mi>H</mi><mo>(</mo><mi>Y</mi><mi mathvariant="normal">∣</mi><mi>X</mi><mo>)</mo><mo>=</mo><msub><mo>∑</mo><mrow><mi>x</mi><mo>∈</mo><mi>X</mi></mrow></msub><mi>p</mi><mo>(</mo><mi>x</mi><mo>)</mo><mi>H</mi><mo>(</mo><mi>Y</mi><mi mathvariant="normal">∣</mi><mi>X</mi><mo>=</mo><mi>x</mi><mo>)</mo></mrow><annotation encoding="application/x-tex">H(Y|X)=\sum_{x\in X} p(x)H(Y|X=x)
</annotation></semantics></math></span><span aria-hidden="true" class="katex-html"><span class="strut" style="height:1.050005em;"></span><span class="strut bottom" style="height:2.3717110000000003em;vertical-align:-1.321706em;"></span><span class="base displaystyle textstyle uncramped"><span class="mord mathit" style="margin-right:0.08125em;">H</span><span class="mopen">(</span><span class="mord mathit" style="margin-right:0.22222em;">Y</span><span class="mord mathrm">∣</span><span class="mord mathit" style="margin-right:0.07847em;">X</span><span class="mclose">)</span><span class="mrel">=</span><span class="mop op-limits"><span class="vlist"><span style="top:1.194336em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord scriptstyle cramped"><span class="mord mathit">x</span><span class="mrel">∈</span><span class="mord mathit" style="margin-right:0.07847em;">X</span></span></span></span><span style="top:-0.000005000000000032756em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span><span class="op-symbol large-op mop">∑</span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mord mathit">p</span><span class="mopen">(</span><span class="mord mathit">x</span><span class="mclose">)</span><span class="mord mathit" style="margin-right:0.08125em;">H</span><span class="mopen">(</span><span class="mord mathit" style="margin-right:0.22222em;">Y</span><span class="mord mathrm">∣</span><span class="mord mathit" style="margin-right:0.07847em;">X</span><span class="mrel">=</span><span class="mord mathit">x</span><span class="mclose">)</span></span></span></span></span></p> <p>或者写成：</p> <p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math><semantics><mrow><mi>H</mi><mo>(</mo><mi>Y</mi><mi mathvariant="normal">∣</mi><mi>X</mi><mo>)</mo><mo>=</mo><msubsup><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mrow><mi>n</mi></mrow></msubsup><mi>p</mi><mo>(</mo><msub><mi>x</mi><mi>i</mi></msub><mo>)</mo><mi>H</mi><mo>(</mo><mi>Y</mi><mi mathvariant="normal">∣</mi><mi>X</mi><mo>=</mo><msub><mi>x</mi><mi>i</mi></msub><mo>)</mo></mrow><annotation encoding="application/x-tex">H(Y|X)=\sum_{i=1}^{n} p(x_i)H(Y|X=x_i)
</annotation></semantics></math></span><span aria-hidden="true" class="katex-html"><span class="strut" style="height:1.6513970000000002em;"></span><span class="strut bottom" style="height:2.929066em;vertical-align:-1.277669em;"></span><span class="base displaystyle textstyle uncramped"><span class="mord mathit" style="margin-right:0.08125em;">H</span><span class="mopen">(</span><span class="mord mathit" style="margin-right:0.22222em;">Y</span><span class="mord mathrm">∣</span><span class="mord mathit" style="margin-right:0.07847em;">X</span><span class="mclose">)</span><span class="mrel">=</span><span class="mop op-limits"><span class="vlist"><span style="top:1.1776689999999999em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord scriptstyle cramped"><span class="mord mathit">i</span><span class="mrel">=</span><span class="mord mathrm">1</span></span></span></span><span style="top:-0.000005000000000143778em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span><span class="op-symbol large-op mop">∑</span></span></span><span style="top:-1.2500050000000003em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle uncramped"><span class="mord scriptstyle uncramped"><span class="mord mathit">n</span></span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mord mathit">p</span><span class="mopen">(</span><span class="mord"><span class="mord mathit">x</span><span class="vlist"><span style="top:0.15em;margin-right:0.05em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord mathit">i</span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mclose">)</span><span class="mord mathit" style="margin-right:0.08125em;">H</span><span class="mopen">(</span><span class="mord mathit" style="margin-right:0.22222em;">Y</span><span class="mord mathrm">∣</span><span class="mord mathit" style="margin-right:0.07847em;">X</span><span class="mrel">=</span><span class="mord"><span class="mord mathit">x</span><span class="vlist"><span style="top:0.15em;margin-right:0.05em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord mathit">i</span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mclose">)</span></span></span></span></span></p> <p>条件熵表示在已知随机变量 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="X"></mjx-c></mjx-mi></mjx-math></mjx-container> 的条件下，<mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="Y"></mjx-c></mjx-mi></mjx-math></mjx-container> 的<strong>条件概率分布</strong>的熵<strong>对随机变量 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="X"></mjx-c></mjx-mi></mjx-math></mjx-container></strong> 的数学期望。熵是数学期望（信息量的数学期望），条件熵也是数学期望，是数学期望的数学期望，有点拗口，不妨把定义多看几遍，就清楚了。</p> <p>这里又假设随机变量 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="Y"></mjx-c></mjx-mi></mjx-math></mjx-container> 有 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="m"></mjx-c></mjx-mi></mjx-math></mjx-container> 个取值，将 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="H"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="Y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="|"></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="X"></mjx-c></mjx-mi><mjx-mo space="4" class="mjx-n"><mjx-c c="="></mjx-c></mjx-mo><mjx-msub space="4"><mjx-mi noIC="true" class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-script style="vertical-align:-0.15em;"><mjx-mi size="s" class="mjx-i"><mjx-c c="i"></mjx-c></mjx-mi></mjx-script></mjx-msub><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 用定义式</p> <p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math><semantics><mrow><mi>H</mi><mo>(</mo><mi>Y</mi><mi mathvariant="normal">∣</mi><mi>X</mi><mo>=</mo><msub><mi>x</mi><mi>i</mi></msub><mo>)</mo><mo>=</mo><mo>−</mo><msubsup><mo>∑</mo><mrow><mi>j</mi><mo>=</mo><mn>1</mn></mrow><mrow><mi>m</mi></mrow></msubsup><mi>p</mi><mo>(</mo><msub><mi>y</mi><mi>j</mi></msub><mi mathvariant="normal">∣</mi><mi>X</mi><mo>=</mo><msub><mi>x</mi><mi>i</mi></msub><mo>)</mo><mi>log</mi><mi>p</mi><mo>(</mo><msub><mi>y</mi><mi>j</mi></msub><mi mathvariant="normal">∣</mi><mi>X</mi><mo>=</mo><msub><mi>x</mi><mi>i</mi></msub><mo>)</mo></mrow><annotation encoding="application/x-tex">H(Y|X=x_i) = - \sum_{j=1}^{m} p(y_j|X=x_i)\log p(y_j|X=x_i)
</annotation></semantics></math></span><span aria-hidden="true" class="katex-html"><span class="strut" style="height:1.6513970000000007em;"></span><span class="strut bottom" style="height:3.0651740000000007em;vertical-align:-1.4137769999999998em;"></span><span class="base displaystyle textstyle uncramped"><span class="mord mathit" style="margin-right:0.08125em;">H</span><span class="mopen">(</span><span class="mord mathit" style="margin-right:0.22222em;">Y</span><span class="mord mathrm">∣</span><span class="mord mathit" style="margin-right:0.07847em;">X</span><span class="mrel">=</span><span class="mord"><span class="mord mathit">x</span><span class="vlist"><span style="top:0.15em;margin-right:0.05em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord mathit">i</span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mclose">)</span><span class="mrel">=</span><span class="mord">−</span><span class="mop op-limits"><span class="vlist"><span style="top:1.1776689999999999em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord scriptstyle cramped"><span class="mord mathit" style="margin-right:0.05724em;">j</span><span class="mrel">=</span><span class="mord mathrm">1</span></span></span></span><span style="top:-0.000005000000000254801em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span><span class="op-symbol large-op mop">∑</span></span></span><span style="top:-1.2500050000000005em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle uncramped"><span class="mord scriptstyle uncramped"><span class="mord mathit">m</span></span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mord mathit">p</span><span class="mopen">(</span><span class="mord"><span class="mord mathit" style="margin-right:0.03588em;">y</span><span class="vlist"><span style="top:0.15em;margin-right:0.05em;margin-left:-0.03588em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord mathit" style="margin-right:0.05724em;">j</span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mord mathrm">∣</span><span class="mord mathit" style="margin-right:0.07847em;">X</span><span class="mrel">=</span><span class="mord"><span class="mord mathit">x</span><span class="vlist"><span style="top:0.15em;margin-right:0.05em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord mathit">i</span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mclose">)</span><span class="mop">lo<span style="margin-right:0.01389em;">g</span></span><span class="mord mathit">p</span><span class="mopen">(</span><span class="mord"><span class="mord mathit" style="margin-right:0.03588em;">y</span><span class="vlist"><span style="top:0.15em;margin-right:0.05em;margin-left:-0.03588em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord mathit" style="margin-right:0.05724em;">j</span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mord mathrm">∣</span><span class="mord mathit" style="margin-right:0.07847em;">X</span><span class="mrel">=</span><span class="mord"><span class="mord mathit">x</span><span class="vlist"><span style="top:0.15em;margin-right:0.05em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord mathit">i</span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mclose">)</span></span></span></span></span></p> <p>代入上式，得</p>
\begin{eqnarray}
H(Y|X) &amp;=&amp; \sum_{i=1}^{n} p(x_i)H(Y|X=x_i) \\
&amp;=&amp; \sum_{i=1}^{n} p(x_i)\left(- \sum_{j=1}^{m} p(y_j|X=x_i) \log p(y_j|X=x_i)\right) \\
&amp;=&amp; -\sum_{i=1}^{n}p(x_i) \sum_{j=1}^{m} p(y_j|x_i) \log p(y_j|x_i)
\end{eqnarray}

<p>即</p> <p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math><semantics><mrow><mi>H</mi><mo>(</mo><mi>Y</mi><mi mathvariant="normal">∣</mi><mi>X</mi><mo>)</mo><mo>=</mo><msubsup><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mrow><mi>n</mi></mrow></msubsup><mi>p</mi><mo>(</mo><msub><mi>x</mi><mi>i</mi></msub><mo>)</mo><mi>H</mi><mo>(</mo><mi>Y</mi><mi mathvariant="normal">∣</mi><mi>X</mi><mo>=</mo><msub><mi>x</mi><mi>i</mi></msub><mo>)</mo><mo>=</mo><mo>−</mo><msubsup><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mrow><mi>n</mi></mrow></msubsup><mi>p</mi><mo>(</mo><msub><mi>x</mi><mi>i</mi></msub><mo>)</mo><msubsup><mo>∑</mo><mrow><mi>j</mi><mo>=</mo><mn>1</mn></mrow><mrow><mi>m</mi></mrow></msubsup><mi>p</mi><mo>(</mo><msub><mi>y</mi><mi>j</mi></msub><mi mathvariant="normal">∣</mi><msub><mi>x</mi><mi>i</mi></msub><mo>)</mo><mi>log</mi><mi>p</mi><mo>(</mo><msub><mi>y</mi><mi>j</mi></msub><mi mathvariant="normal">∣</mi><msub><mi>x</mi><mi>i</mi></msub><mo>)</mo></mrow><annotation encoding="application/x-tex">H(Y|X)=\sum_{i=1}^{n} p(x_i)H(Y|X=x_i) =-\sum_{i=1}^{n}p(x_i) \sum_{j=1}^{m} p(y_j|x_i) \log p(y_j|x_i)
</annotation></semantics></math></span><span aria-hidden="true" class="katex-html"><span class="strut" style="height:1.6513970000000007em;"></span><span class="strut bottom" style="height:3.0651740000000007em;vertical-align:-1.4137769999999998em;"></span><span class="base displaystyle textstyle uncramped"><span class="mord mathit" style="margin-right:0.08125em;">H</span><span class="mopen">(</span><span class="mord mathit" style="margin-right:0.22222em;">Y</span><span class="mord mathrm">∣</span><span class="mord mathit" style="margin-right:0.07847em;">X</span><span class="mclose">)</span><span class="mrel">=</span><span class="mop op-limits"><span class="vlist"><span style="top:1.1776689999999999em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord scriptstyle cramped"><span class="mord mathit">i</span><span class="mrel">=</span><span class="mord mathrm">1</span></span></span></span><span style="top:-0.000005000000000143778em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span><span class="op-symbol large-op mop">∑</span></span></span><span style="top:-1.2500050000000003em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle uncramped"><span class="mord scriptstyle uncramped"><span class="mord mathit">n</span></span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mord mathit">p</span><span class="mopen">(</span><span class="mord"><span class="mord mathit">x</span><span class="vlist"><span style="top:0.15em;margin-right:0.05em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord mathit">i</span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mclose">)</span><span class="mord mathit" style="margin-right:0.08125em;">H</span><span class="mopen">(</span><span class="mord mathit" style="margin-right:0.22222em;">Y</span><span class="mord mathrm">∣</span><span class="mord mathit" style="margin-right:0.07847em;">X</span><span class="mrel">=</span><span class="mord"><span class="mord mathit">x</span><span class="vlist"><span style="top:0.15em;margin-right:0.05em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord mathit">i</span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mclose">)</span><span class="mrel">=</span><span class="mord">−</span><span class="mop op-limits"><span class="vlist"><span style="top:1.1776689999999999em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord scriptstyle cramped"><span class="mord mathit">i</span><span class="mrel">=</span><span class="mord mathrm">1</span></span></span></span><span style="top:-0.000005000000000143778em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span><span class="op-symbol large-op mop">∑</span></span></span><span style="top:-1.2500050000000003em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle uncramped"><span class="mord scriptstyle uncramped"><span class="mord mathit">n</span></span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mord mathit">p</span><span class="mopen">(</span><span class="mord"><span class="mord mathit">x</span><span class="vlist"><span style="top:0.15em;margin-right:0.05em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord mathit">i</span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mclose">)</span><span class="mop op-limits"><span class="vlist"><span style="top:1.1776689999999999em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord scriptstyle cramped"><span class="mord mathit" style="margin-right:0.05724em;">j</span><span class="mrel">=</span><span class="mord mathrm">1</span></span></span></span><span style="top:-0.000005000000000254801em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span><span class="op-symbol large-op mop">∑</span></span></span><span style="top:-1.2500050000000005em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle uncramped"><span class="mord scriptstyle uncramped"><span class="mord mathit">m</span></span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mord mathit">p</span><span class="mopen">(</span><span class="mord"><span class="mord mathit" style="margin-right:0.03588em;">y</span><span class="vlist"><span style="top:0.15em;margin-right:0.05em;margin-left:-0.03588em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord mathit" style="margin-right:0.05724em;">j</span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mord mathrm">∣</span><span class="mord"><span class="mord mathit">x</span><span class="vlist"><span style="top:0.15em;margin-right:0.05em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord mathit">i</span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mclose">)</span><span class="mop">lo<span style="margin-right:0.01389em;">g</span></span><span class="mord mathit">p</span><span class="mopen">(</span><span class="mord"><span class="mord mathit" style="margin-right:0.03588em;">y</span><span class="vlist"><span style="top:0.15em;margin-right:0.05em;margin-left:-0.03588em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord mathit" style="margin-right:0.05724em;">j</span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mord mathrm">∣</span><span class="mord"><span class="mord mathit">x</span><span class="vlist"><span style="top:0.15em;margin-right:0.05em;margin-left:0em;"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span><span class="reset-textstyle scriptstyle cramped"><span class="mord mathit">i</span></span></span><span class="baseline-fix"><span class="fontsize-ensurer reset-size5 size5"><span style="font-size:0em;">​</span></span>​</span></span></span><span class="mclose">)</span></span></span></span></span></p> <p>条件熵即按一个新的变量的每个值对原变量进行分类，比如《统计学习方法》中将是否同意贷款按照申请人是否有自己的房子分为两类。然后在每一个小类里面，都计算熵，然后每一个熵乘以各个类别的概率，然后求和（加权求和）。我们用另一个变量对原变量分类后，原变量的不确定性就会减小了，因为新增了 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="Y"></mjx-c></mjx-mi></mjx-math></mjx-container> 的信息，不确定程度减少了多少就是信息的增益。</p> <p>经验条件熵如何计算：<mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="H"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="D"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="|"></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="A"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container>。</p>
\begin{equation}\begin{split}
H(D|A) &amp;= -\sum_{i,k}p(D_k,A_i)\log p(D_k,A_i) \\
&amp;= -\sum_{i,k}p(A_i)p(D_k|A_i)\log p(D_k,A_i)【这一步把 p(D_k|A_i) 化开】\\
&amp;= -\sum_{i=1}^{n}\sum_{k=1}^{K} p(A_i)p(D_k,A_i) \log p(D_k,A_i) \\
&amp;= -\sum_{i=1}^{n} p(A_i) \sum_{k=1}^{K} p(D_k|A_i) \log p(D_k|A_i) \\
&amp;= -\sum_{i=1}^{n} \frac{|D_i|}{|D|}\sum_{k=1}^{K} \frac{|D_{ik}|}{|D_i|} \log \frac{|D_{ik}|}{|D_i|}
\end{split}\end{equation}

<h3 id="决策树中条件熵和最大熵原理中的条件熵"><a href="#决策树中条件熵和最大熵原理中的条件熵" class="header-anchor">#</a> 决策树中条件熵和最大熵原理中的条件熵</h3> <p>决策树中特征选择，希望条件熵越小越好，这样信息增益就大，此时我们是在选最佳特征，是在已知特征中找最能把类标分清楚的特征。</p> <p>在最大熵原理中，我们希望选择的模型“越平均越好”，选择的是概率模型，即在未知的概率模型中，选择一个除了满足已知信息以外，不做任何其它假设的概率模型，我们希望条件熵最大。</p> <h3 id="理解特征函数"><a href="#理解特征函数" class="header-anchor">#</a> 理解特征函数</h3> <p>输入 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi></mjx-math></mjx-container> 和输出 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi></mjx-math></mjx-container> 之间的关系，使用特征函数 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="f"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi space="2" class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 来表示，可以想象 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi></mjx-math></mjx-container> 取某个值的时候，<mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi></mjx-math></mjx-container> 就有一个值与之对应，它描述的是输入变量和输出变量的关系，用特征函数来刻画：</p>
f(x,y)=\begin{cases}
1, &amp; x 与 y 满足某一事实 \\
0, &amp; 否则 \\
\end{cases}

<p>特征函数是把自变量和因变量的关系都考虑进去的函数。可以在皮果提的文章：<a href="https://blog.csdn.net/itplus/article/details/26550201" target="_blank" rel="noopener noreferrer">最大熵学习笔记（三）最大熵模型<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a> 中找到具体的例子，理解特征函数的定义。</p> <p>还可以参考这篇知乎问答：<a href="https://www.zhihu.com/question/24094554" target="_blank" rel="noopener noreferrer">如何理解最大熵模型里面的特征？<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a>。</p> <p><img src="https://upload-images.jianshu.io/upload_images/414598-afc819298fe1f760.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240" alt="image.png"></p> <p><img src="https://upload-images.jianshu.io/upload_images/414598-edb5dce2aa80d62f.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240" alt="image.png"></p> <h3 id="公式推导"><a href="#公式推导" class="header-anchor">#</a> 公式推导</h3> <p>1、计算两个经验分布，所谓经验分布，就是从训练数据集中统计出来的频率值。只要是经验分布，我们在表示概率的时候，都在上面加上“一弯”进行区别。从训练数据集可以得到：</p> <p>（1）联合分布 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="Y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="|"></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="X"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 的经验分布：</p>
\widetilde P(X=x,Y=y) = \frac{v(X=x,Y=y)}{N}

<p>（2）边缘分布 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="X"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 的经验分布：</p>
\widetilde P(X=x) = \frac{v(X=x)}{N}

<p>其中 $v(X=x,Y=y) $  表示训练样本中 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi space="2" class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 出现的频数；<mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="v"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="X"></mjx-c></mjx-mi><mjx-mo space="4" class="mjx-n"><mjx-c c="="></mjx-c></mjx-mo><mjx-mi space="4" class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 表示训练数据集中输入 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi></mjx-math></mjx-container> 出现的频数，<mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="N"></mjx-c></mjx-mi></mjx-math></mjx-container> 表示训练样本容量。</p> <p>2、通过这两个经验分布，建立与特征函数和目标函数的等式，形成约束条件。因为 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi space="2" class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 未知，但是可以用 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-TeXAtom><mjx-mover><mjx-over style="padding-bottom:0.06em;padding-left:0.263em;margin-bottom:-0.597em;"><mjx-mo class="mjx-sop"><mjx-c c="2DC"></mjx-c></mjx-mo></mjx-over><mjx-base><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi></mjx-base></mjx-mover></mjx-TeXAtom><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi space="2" class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 代替，而<mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="|"></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 为所求，又有 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi space="2" class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo><mjx-mo space="4" class="mjx-n"><mjx-c c="="></mjx-c></mjx-mo><mjx-mi space="4" class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="|"></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container>，<mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 也未知，但是可以用 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-TeXAtom><mjx-mover><mjx-over style="padding-bottom:0.06em;padding-left:0.263em;margin-bottom:-0.597em;"><mjx-mo class="mjx-sop"><mjx-c c="2DC"></mjx-c></mjx-mo></mjx-over><mjx-base><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi></mjx-base></mjx-mover></mjx-TeXAtom><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 代替，于是就有：</p> <p>（1）特征函数 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="f"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi space="2" class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 关于经验分布（训练数据集的分布）$\widetilde P(X,Y) $ 的期望值：</p>
E_{\widetilde P}(f) = \sum_{x,y}\widetilde P(x,y)f(x,y)

<p>（2）特征函数 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="f"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi space="2" class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 关于模型 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="Y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="|"></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="X"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 与经验分布 $\widetilde P(X) $ 的期望值：</p>
E_{P}(f) =\sum_{x,y} \widetilde P(x)P(y|x)f(x,y)

<p>如果模型能够获取从训练数据集中的信息，那么就可以假设这两个期望值相等。</p>
E_{P}(f)  = E_{\widetilde P}(f) 

<p>即：</p>
\sum_{x,y} \widetilde P(x)P(y|x)f(x,y) = \sum_{x,y}\widetilde P(x,y)f(x,y)

<p>上面的两个等式作为条件概率的最大熵的约束条件。如果有 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="n"></mjx-c></mjx-mi></mjx-math></mjx-container> 个特征函数，那么就有 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="n"></mjx-c></mjx-mi></mjx-math></mjx-container> 个约束条件。</p> <p>于是最大熵模型就可以表示成：
1、目标函数：条件熵，最优化的方向是使其最大。</p> <p>2、约束条件：
（1）有多少个特征函数，就有多少个约束条件。
（2）条件概率之和为 1。</p> <p>3、有约束的最大化问题可以使用拉格朗日乘子法。
引入拉格朗日乘子，写出拉格朗日函数。接下来的步骤和 SVM 是一样的。
利用拉格朗日对偶性，<strong>求解对偶问题</strong>。即：原始问题是 $ \min_p \max_w$ 的，对偶问题是 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-munder limits="false"><mjx-mo class="mjx-n"><mjx-c c="m"></mjx-c><mjx-c c="a"></mjx-c><mjx-c c="x"></mjx-c></mjx-mo><mjx-script style="vertical-align:-0.15em;"><mjx-mi size="s" class="mjx-i"><mjx-c c="w"></mjx-c></mjx-mi></mjx-script></mjx-munder><mjx-munder space="2" limits="false"><mjx-mo class="mjx-n"><mjx-c c="m"></mjx-c><mjx-c c="i"></mjx-c><mjx-c c="n"></mjx-c></mjx-mo><mjx-script style="vertical-align:-0.15em;"><mjx-mi size="s" class="mjx-i"><mjx-c c="p"></mjx-c></mjx-mi></mjx-script></mjx-munder></mjx-math></mjx-container> 的。</p> <blockquote><p>最大熵模型的学习可以归结为对偶函数的极大化问题。对偶函数的极大化等价于最大熵模型的极大似然估计。</p></blockquote> <p>总结：</p>
\sum_{x,y}\widetilde P(x,y)f(x,y) = \sum_{x,y}P(x,y)f(x,y)

<p>左边 </p><p><mjx-container jax="CHTML" display="true" class="MathJax"><mjx-math display="true" class=" MJX-TEX"><mjx-msub><mjx-mi noIC="true" class="mjx-i"><mjx-c c="E"></mjx-c></mjx-mi><mjx-script style="vertical-align:-0.331em;"><mjx-TeXAtom size="s"><mjx-TeXAtom><mjx-mover><mjx-over style="padding-bottom:0.06em;padding-left:0.263em;margin-bottom:-0.597em;"><mjx-mo class="mjx-sop"><mjx-c c="2DC"></mjx-c></mjx-mo></mjx-over><mjx-base><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi></mjx-base></mjx-mover></mjx-TeXAtom></mjx-TeXAtom></mjx-script></mjx-msub><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="f"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo><mjx-mo space="4" class="mjx-n"><mjx-c c="="></mjx-c></mjx-mo><mjx-munder space="4"><mjx-row><mjx-base><mjx-mo class="mjx-lop"><mjx-c c="2211"></mjx-c></mjx-mo></mjx-base></mjx-row><mjx-row><mjx-under style="padding-top:0.287em;padding-left:0.248em;"><mjx-TeXAtom size="s"><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi></mjx-TeXAtom></mjx-under></mjx-row></mjx-munder><mjx-TeXAtom space="2"><mjx-mover><mjx-over style="padding-bottom:0.06em;padding-left:0.263em;margin-bottom:-0.597em;"><mjx-mo class="mjx-sop"><mjx-c c="2DC"></mjx-c></mjx-mo></mjx-over><mjx-base><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi></mjx-base></mjx-mover></mjx-TeXAtom><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi space="2" class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="f"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi space="2" class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container></p> 表示<strong>特征函数 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="f"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi space="2" class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 在训练数据集上关于经验分布 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-TeXAtom><mjx-mover><mjx-over style="padding-bottom:0.06em;padding-left:0.263em;margin-bottom:-0.597em;"><mjx-mo class="mjx-sop"><mjx-c c="2DC"></mjx-c></mjx-mo></mjx-over><mjx-base><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi></mjx-base></mjx-mover></mjx-TeXAtom><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi space="2" class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 的数学期望</strong>；<p></p> <p>右边 </p><p><mjx-container jax="CHTML" display="true" class="MathJax"><mjx-math display="true" class=" MJX-TEX"><mjx-msub><mjx-mi noIC="true" class="mjx-i"><mjx-c c="E"></mjx-c></mjx-mi><mjx-script style="vertical-align:-0.15em;"><mjx-TeXAtom size="s"><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi></mjx-TeXAtom></mjx-script></mjx-msub><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="f"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo><mjx-mo space="4" class="mjx-n"><mjx-c c="="></mjx-c></mjx-mo><mjx-munder space="4"><mjx-row><mjx-base><mjx-mo class="mjx-lop"><mjx-c c="2211"></mjx-c></mjx-mo></mjx-base></mjx-row><mjx-row><mjx-under style="padding-top:0.287em;padding-left:0.248em;"><mjx-TeXAtom size="s"><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi></mjx-TeXAtom></mjx-under></mjx-row></mjx-munder><mjx-mi space="2" class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi space="2" class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="f"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi space="2" class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container></p> 表示<strong>特征函数 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="f"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi space="2" class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 在模型上关于理论分布 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mi space="2" class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 的数学期望</strong>。<p></p> <p>下面是我的手写笔记：</p> <p><img src="http://upload-images.jianshu.io/upload_images/414598-8122291c146c2101.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240" alt="最大熵模型手写笔记-1"></p> <p><img src="http://upload-images.jianshu.io/upload_images/414598-cfca49dec94100d9.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240" alt="最大熵模型手写笔记-2"></p> <p><img src="http://upload-images.jianshu.io/upload_images/414598-d8c98a87df043417.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240" alt="最大熵模型手写笔记-3"></p> <h3 id="最大熵模型的优点"><a href="#最大熵模型的优点" class="header-anchor">#</a> 最大熵模型的优点</h3> <p>最大熵模型在分类方法里算是比较优的模型，但是由于它的约束函数的数目一般来说会随着样本量的增大而增大，导致<strong>样本量很大的时候，对偶函数优化求解的迭代过程非常慢</strong>，scikit-learn 甚至都没有最大熵模型对应的类库。但是理解它仍然很有意义，尤其是它和很多分类方法都有千丝万缕的联系。</p> <p>最大熵模型的优点有：</p> <p>1、最大熵统计模型获得的是所有满足约束条件的模型中条件熵极大的模型，作为经典的分类模型时准确率较高；</p> <p>2、可以灵活地设置约束条件，通过约束条件的多少可以调节模型对未知数据的适应度和对已知数据的拟合程度。</p> <h3 id="最大熵模型的缺点"><a href="#最大熵模型的缺点" class="header-anchor">#</a> 最大熵模型的缺点</h3> <p>由于约束函数数量和样本数目有关系，导致迭代过程计算量巨大，实际应用比较难。</p> <h2 id="参考资料"><a href="#参考资料" class="header-anchor">#</a> 参考资料</h2> <p>[1] 李航. 统计学习方法（第 2 版）第 6 章“最大熵模型”. 北京：清华大学出版社，2019.</p> <p>[2] 吴军. 数学之美（第 20 章“不要把鸡蛋放到一个篮子里”）. 北京：人民邮电出版社，2012.</p> <p>[3]皮果提的系列文章：
<a href="http://blog.csdn.net/itplus/article/details/26550597" target="_blank" rel="noopener noreferrer">最大熵学习笔记（零）目录和引言<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a> <a href="http://blog.csdn.net/itplus/article/details/26549871" target="_blank" rel="noopener noreferrer">最大熵学习笔记（一）预备知识<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a> <a href="http://blog.csdn.net/itplus/article/details/26550127" target="_blank" rel="noopener noreferrer">最大熵学习笔记（二）最大熵原理<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a> <a href="http://blog.csdn.net/itplus/article/details/26550201" target="_blank" rel="noopener noreferrer">最大熵学习笔记（三）最大熵模型<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a> <a href="http://blog.csdn.net/itplus/article/details/26550273" target="_blank" rel="noopener noreferrer">最大熵学习笔记（四）模型求解<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a> <a href="http://blog.csdn.net/itplus/article/details/26550369" target="_blank" rel="noopener noreferrer">最大熵学习笔记（五）最优化算法<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a> <a href="http://blog.csdn.net/itplus/article/details/26550451" target="_blank" rel="noopener noreferrer">最大熵学习笔记（六）优缺点分析<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <hr> <p>以下为草稿，我自己留着备用，读者可以忽略，欢迎大家批评指正。</p> <h3 id="参考资料-2"><a href="#参考资料-2" class="header-anchor">#</a> 参考资料</h3> <p>函数 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo space="4" class="mjx-n"><mjx-c c="="></mjx-c></mjx-mo><mjx-mi space="4" class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo space="3" class="mjx-n"><mjx-c c="2212"></mjx-c></mjx-mo><mjx-mn space="3" class="mjx-n"><mjx-c c="1"></mjx-c></mjx-mn></mjx-math></mjx-container> 与 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="y"></mjx-c></mjx-mi><mjx-mo space="4" class="mjx-n"><mjx-c c="="></mjx-c></mjx-mo><mjx-mi space="4" class="mjx-i"><mjx-c c="l"></mjx-c></mjx-mi><mjx-mi class="mjx-i"><mjx-c c="n"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c="("></mjx-c></mjx-mo><mjx-mi class="mjx-i"><mjx-c c="x"></mjx-c></mjx-mi><mjx-mo class="mjx-n"><mjx-c c=")"></mjx-c></mjx-mo></mjx-math></mjx-container> 的图像：</p> <p><img src="https://upload-images.jianshu.io/upload_images/414598-c6ec4300e573ac0b.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240" alt="image.png"></p> <p><img src="https://upload-images.jianshu.io/upload_images/414598-e9c222994150e710.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240" alt="image.png"></p> <p>1、温暖的蛋壳：最大熵模型
地址：http://47.96.227.134/index.php/2018/06/17/maxentropy/
说明：这篇文章指出了：最大熵模型是一种很抽象的描述，在不同的场景和业务中我们可能需要定义不同的特征函数求出对应的权重值，来解决不同的问题。</p> <p>2、码农场：逻辑斯谛回归与最大熵模型
地址：http://www.hankcs.com/ml/the-logistic-regression-and-the-maximum-entropy-model.html</p> <p>3、统计学习方法|最大熵原理剖析及实现
地址：<a href="https://www.pkudodo.com/2018/12/05/1-7/" target="_blank" rel="noopener noreferrer">https://www.pkudodo.com/2018/12/05/1-7/<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <p>4、白马负金羁：<a href="http://blog.csdn.net/baimafujinji/article/details/78986906" target="_blank" rel="noopener noreferrer">最大熵模型（MaxEnt）：万法归宗（上）<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <p><a href="https://blog.csdn.net/baimafujinji/article/details/78992878" target="_blank" rel="noopener noreferrer">最大熵模型（MaxEnt）：万法归宗（下）<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <p>5、<a href="https://blog.csdn.net/tina_ttl/article/details/53542004" target="_blank" rel="noopener noreferrer">李航·统计学习方法笔记·第6章 logistic regression与最大熵模型（2）·最大熵模型<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <p>6、马同学：<a href="https://www.matongxue.com/madocs/939.html" target="_blank" rel="noopener noreferrer">如何理解拉格朗日乘子法？<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <p>7、自然语言处理之一：最大熵模型</p> <p><a href="https://blog.csdn.net/chjjunking/article/details/6452051" target="_blank" rel="noopener noreferrer">https://blog.csdn.net/chjjunking/article/details/6452051<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <p>8、李航统计学习－最大熵模型之我的理解</p> <p><a href="https://blog.csdn.net/szq34_2008/article/details/79186664" target="_blank" rel="noopener noreferrer">https://blog.csdn.net/szq34_2008/article/details/79186664<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <p>9、最大熵原理</p> <p><a href="https://blog.csdn.net/longshenlmj/article/details/8697472" target="_blank" rel="noopener noreferrer">https://blog.csdn.net/longshenlmj/article/details/8697472<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <p>10、最大熵与逻辑回归的等价性</p> <p><a href="https://blog.csdn.net/buring_/article/details/43342341" target="_blank" rel="noopener noreferrer">https://blog.csdn.net/buring_/article/details/43342341<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <p>12、统计学习-逻辑回归(LR)和最大熵模型</p> <p><a href="https://blog.csdn.net/u013207865/article/details/52728137" target="_blank" rel="noopener noreferrer">https://blog.csdn.net/u013207865/article/details/52728137<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <p>13、机器学习——最大熵原理</p> <p><a href="https://blog.csdn.net/u012654283/article/details/45505711" target="_blank" rel="noopener noreferrer">https://blog.csdn.net/u012654283/article/details/45505711<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <p>14、最大熵模型与GIS ，IIS算法</p> <p><a href="https://blog.csdn.net/u014688145/article/details/55003910" target="_blank" rel="noopener noreferrer">https://blog.csdn.net/u014688145/article/details/55003910<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <p>15、机器学习笔记（二十）——求解最大熵模型</p> <p><a href="https://blog.csdn.net/chunyun0716/article/details/53574630" target="_blank" rel="noopener noreferrer">https://blog.csdn.net/chunyun0716/article/details/53574630<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <p>在机器学习中，<mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi></mjx-math></mjx-container> 往往用来表示样本的真实分布，比如 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mo class="mjx-n"><mjx-c c="["></mjx-c></mjx-mo><mjx-mn class="mjx-n"><mjx-c c="1"></mjx-c></mjx-mn><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mn space="2" class="mjx-n"><mjx-c c="0"></mjx-c></mjx-mn><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mn space="2" class="mjx-n"><mjx-c c="0"></mjx-c></mjx-mn><mjx-mo class="mjx-n"><mjx-c c="]"></mjx-c></mjx-mo></mjx-math></mjx-container> 表示当前样本属于第一类。<mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="Q"></mjx-c></mjx-mi></mjx-math></mjx-container> 用来表示模型所预测的分布，比如 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mo class="mjx-n"><mjx-c c="["></mjx-c></mjx-mo><mjx-mn class="mjx-n"><mjx-c c="0"></mjx-c><mjx-c c="."></mjx-c><mjx-c c="7"></mjx-c></mjx-mn><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mn space="2" class="mjx-n"><mjx-c c="0"></mjx-c><mjx-c c="."></mjx-c><mjx-c c="2"></mjx-c></mjx-mn><mjx-mo class="mjx-n"><mjx-c c=","></mjx-c></mjx-mo><mjx-mn space="2" class="mjx-n"><mjx-c c="0"></mjx-c><mjx-c c="."></mjx-c><mjx-c c="1"></mjx-c></mjx-mn><mjx-mo class="mjx-n"><mjx-c c="]"></mjx-c></mjx-mo></mjx-math></mjx-container>。即 <mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="P"></mjx-c></mjx-mi></mjx-math></mjx-container> 表示真实类别的独热编码，<mjx-container jax="CHTML" class="MathJax"><mjx-math class=" MJX-TEX"><mjx-mi class="mjx-i"><mjx-c c="Q"></mjx-c></mjx-mi></mjx-math></mjx-container> 表示预测的概率分布。</p> <p>16、机器学习笔记（十九）——最大熵原理和模型定义</p> <p><a href="https://blog.csdn.net/chunyun0716/article/details/53365968" target="_blank" rel="noopener noreferrer">https://blog.csdn.net/chunyun0716/article/details/53365968<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <p>17、一步一步理解最大熵模型</p> <p><a href="https://www.cnblogs.com/wxquare/p/5858008.html" target="_blank" rel="noopener noreferrer">https://www.cnblogs.com/wxquare/p/5858008.html<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <p>18、李航·统计学习方法笔记·第6章 logistic regression与最大熵模型（2）·最大熵模型</p> <p><a href="https://blog.csdn.net/tina_ttl/article/details/53542004" target="_blank" rel="noopener noreferrer">https://blog.csdn.net/tina_ttl/article/details/53542004<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></p> <p>（本节完）</p></div> <footer class="page-edit"><!----> <div class="last-updated"><span class="prefix">上次更新:</span> <span class="time">4/10/2021, 6:19:58 PM</span></div></footer> <!----> </main></div><div class="global-ui"><!----></div></div>
    <script src="/book/assets/js/app.a4355b0d.js" defer></script><script src="/book/assets/js/2.ba785ee6.js" defer></script><script src="/book/assets/js/47.e7767237.js" defer></script>
  </body>
</html>
